Automatically Segmenting Oral History Transcripts

Author

  • Ryan Shaw
Abstract

Dividing oral histories into topically coherent segments can make them more accessible online. People regularly make judgments about where coherent segments can be extracted from oral histories. But making these judgments can be taxing, so automated assistance is potentially attractive to speed the task of extracting segments from open-ended interviews. When different people are asked to extract coherent segments from the same oral histories, they often do not agree about precisely where such segments begin and end. This low agreement makes the evaluation of algorithmic segmenters challenging, but there is reason to believe that for segmenting oral history transcripts, some approaches are more promising than others. The BayesSeg algorithm performs slightly better than TextTiling, while TextTiling does not perform significantly better than a uniform segmentation. BayesSeg might be used to suggest boundaries to someone segmenting oral histories, but this segmentation task needs to be better defined.
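To make the "uniform segmentation" baseline concrete, the following is a minimal sketch, not the paper's implementation. It places boundaries at roughly equal intervals and scores a hypothesized segmentation against a reference with WindowDiff, a commonly used segmentation metric. The abstract does not say which metric the paper uses, so the choice of WindowDiff, the function names `uniform_boundaries` and `window_diff`, and the convention that a boundary index `j` means "a break after the first `j` units" are all assumptions made for illustration.

```python
def uniform_boundaries(n_units, n_segments):
    """Split n_units (e.g. sentences) into n_segments roughly equal parts.

    Returns a sorted list of boundary indices; index j means a break
    after the first j units. (Hypothetical helper, for illustration.)
    """
    if not 1 <= n_segments <= n_units:
        raise ValueError("need 1 <= n_segments <= n_units")
    return [(i * n_units) // n_segments for i in range(1, n_segments)]


def window_diff(ref, hyp, n_units, k):
    """WindowDiff between reference and hypothesized boundary sets.

    Slides a window of k units over the text and counts the windows in
    which the two segmentations contain a different number of boundaries.
    0.0 means the segmentations agree in every window.
    """
    if not 1 <= k < n_units:
        raise ValueError("need 1 <= k < n_units")
    ref, hyp = set(ref), set(hyp)
    windows = n_units - k
    errors = 0
    for i in range(windows):
        # Boundaries falling inside the window that starts at unit i.
        span = range(i + 1, i + k + 1)
        r = sum(1 for j in span if j in ref)
        h = sum(1 for j in span if j in hyp)
        if r != h:
            errors += 1
    return errors / windows


if __name__ == "__main__":
    baseline = uniform_boundaries(10, 4)
    print(baseline)
    print(window_diff(baseline, [], n_units=10, k=2))
```

Because human annotators often disagree about where segments begin and end, a score like this is probably best read relative to the disagreement between annotators rather than as an absolute measure of quality.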

Similar resources

Segmenting Oral History Transcripts

Dividing oral histories into topically coherent segments can make them more accessible online. People regularly make judgments about where coherent segments can be extracted from oral histories. But when different people are asked to extract coherent segments from the same oral histories, they often do not agree about where such segments begin and end.

Segmentation of Automatically Transcribed Broadcast News Text

Expertise in the automatic transcription of broadcast speech has progressed to the point of being able to use the resulting transcripts for information retrieval purposes. In this paper, we describe the Segmentation system used by Dragon Systems in the Segmentation task of the 1998 TDT evaluation, highlighting improvements made since the September 1998 dry run. Segmentation of closed-caption and...

A Spoken Document Retrieval Application in the Oral History Domain

The application of automatic speech recognition in the broadcast news domain is well studied. Recognition performance is generally high and accordingly, spoken document retrieval can successfully be applied in this domain, as demonstrated by a number of commercial systems. In other domains, a similar recognition performance is hard to obtain, or even far out of reach, for example due to lack of...

Making oral history accessible over the World Wide Web

We describe a multimedia, WWW-based oral history collection constructed from off-the-shelf or publicly available software. The source materials for the collection include audio tapes of interviews and summary transcripts of each interview, as well as photographs illustrating episodes mentioned in the tapes. Sections of the transcripts are manually matched to associated segments of the tapes, an...

Sensitometric characteristics of D-, E- and F-speed dental radiographic films in manual and automatic processing

BACKGROUND: The purpose of this study was to evaluate the sensitometric characteristics of Ultraspeed, Ektaspeed Plus and Insight dental radiographic films using manual and automatic processing systems. METHODS: In this experimental in vitro study, an aluminum step-wedge was used to construct characteristic curves for D-, E- and F-speed radiographic films (Kodak Eastman, Rochester, USA). All fil...

Journal:
  • CoRR

Volume: abs/1509.08842

Publication date: 2015